Skip to content
This repository was archived by the owner on Oct 30, 2024. It is now read-only.

Conversation

@iwilltry42
Copy link
Contributor

@iwilltry42 iwilltry42 commented Oct 28, 2024

  • add: load.gpt tool as a wrapper around knowledge load to transform an input file into Knowledge JSON format and store it to another file.
  • change: ingest and load output final error (if any) as JSON to stdout (and exit with 0) or don't print anything at all. All other logging should go to stderr (slog does that by default, so we're good)
    $ knowledge load ~/Downloads/Tags.xlsx out.json
    {"error": "unsupported file type \"/home/thklein/Downloads/Tags.xlsx\""}
    
  • add: --metadata foo=bar for ingest and load commands (also usable via KNOWLEDGE="foo=bar,baz=bom" in env)
  • change: ingest and load both use the gptscript SDK to read from and write to the workspace if paths are prefixed witth ws:// (to keep knowledge usable as a standalone tool, making it easier to test and evaluate)

@iwilltry42 iwilltry42 marked this pull request as ready for review October 28, 2024 19:08
Copy link
Member

@g-linville g-linville left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only other suggestion is to replace ws:// with workspace:// since ws:// is ordinarily used for WebSocket URLs.

Comment on lines +125 to +128
if strings.HasPrefix(paths[0], "ws://") {
if len(paths) > 1 {
return 0, 0, fmt.Errorf("cannot ingest multiple paths from workspace")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would not catch the case where the user passes a normal path as the first parameter, and a ws:// path as the second.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think path will be "director:///" if it is local, and s3 and r2 s3:///" r2:///`

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we're only checking the very first path for the ws:// prefix. If it's on the second item in the paths slice, it won't get caught.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ws:// is overloaded, but fairly clear with regards to knowledge (which is file-based) I guess 🤷‍♂️

Anyway, those checks are "superficial", as only Otto should be using this stuff.

But you're right and I will fix the checks later on 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm willing to accept these shortcomings (that it only checks the first path and that ws is overloaded).

@g-linville @StrongMonkey if this is your only issue with the PR, can your finish your review?

err := s.run(cmd.Context(), filePath)
if err != nil {
return err
exitErr0(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have to do this because otherwise cobra will return a non-zero if you bubble an err all the way up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup

default: true
globals:
ingestion:
textsplitter:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean you are just using textsplitter until you write the "enhanced" splitter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - but the new one will land today

Copy link
Contributor

@cjellick cjellick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left two comments, but they are just non-blocking questions

Co-authored-by: Darren Shepherd <[email protected]>
@iwilltry42 iwilltry42 merged commit bf3fde2 into main Oct 29, 2024
1 check passed
@iwilltry42 iwilltry42 deleted the change/otto branch October 29, 2024 05:07
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants